Wang et al. BMC Plant Biology 2012, 12:148 
http://www.biomedcentral.com/1471-2229/12/148 






RESEARCH ARTICLE 



Open Access 



Construction of a high-density genetic map for 
grape using next generation restriction-site 
associated DNA sequencing 



Nian Wang 1, Linchuan Fang 1,2, Haiping Xin, Lijun Wang and Shaohua Li



Abstract 

Background: Genetic mapping and QTL detection are powerful methodologies in plant improvement and 
breeding. Construction of a high-density and high-quality genetic map would be of great benefit in the production 
of superior grapes to meet human demand. High throughput and low cost of the recently developed next 
generation sequencing (NGS) technology have resulted in its wide application in genome research. Sequencing 
restriction-site associated DNA (RAD) might be an efficient strategy to simplify genotyping. Combining NGS with 
RAD has proven to be powerful for single nucleotide polymorphism (SNP) marker development. 

Results: An F1 population of 100 individual plants was developed. In-silico digestion-site prediction was used to 
select an appropriate restriction enzyme for construction of a RAD sequencing library. Next generation RAD 
sequencing was applied to genotype the F1 population and its parents. Applying a cluster strategy for SNP 
modulation, a total of 1,814 high-quality SNP markers were developed: 1,121 of these were mapped to the female 
genetic map, 759 to the male map, and 1,646 to the integrated map. A comparison of the genetic maps to the 
published Vitis vinifera genome revealed both conservation and variations. 

Conclusions: The applicability of next generation RAD sequencing for genotyping a grape F1 population was 
demonstrated, leading to the successful development of a genetic map with high density and quality using our 
designed SNP markers. Detailed analysis revealed that this newly developed genetic map can be used for a variety 
of genome investigations, such as QTL detection, sequence assembly and genome comparison. 

Background

Grape (2n = 38) is one of the most important fruit crops
worldwide, with a production of ~68 million tons over a
harvested area of 7.2 million ha in 2010 (FAOSTAT,
2010). Grapes can be classified into table or wine
varieties according to their intended mode of consumption,
i.e., eaten raw or used to make wine. Consumption of
grapes and wine has been reported to benefit human
health [1-4], and demand for high-quality grapes has
recently risen. Grape improvement therefore needs to
optimize desirable characteristics such as the contents of
secondary metabolites, sugars and organic acids,
resistance, and yield. This can be achieved by using
different germplasms from domesticated or wild grapes and
then selecting for the genetic components that control the
superior traits. However, it takes decades to produce
advanced, high-performing cultivars with the required
traits, and the resources available for breeding grapes of
high quality and yield are still insufficient. A
high-density genetic map is one of the most valuable
genomic resources: it can reveal much of the genome
composition and supports high-throughput selection of
superior traits across many germplasms, in plants and
animals alike. Construction of a high-quality genetic map
for grape is therefore necessary for further study and
production.

In the past two decades, there have been a number of 
reports on the construction of grape genetic maps. Lodhi 






© 2012 Wang et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.






et al. [5] developed a genetic map for Vitis with 422
random amplified polymorphic DNA (RAPD) and 16
restriction fragment length polymorphism (RFLP) markers,
as well as a number of isozyme markers, possibly the first
report of a complete genetic map for grape. Since that
study, a number of new genetic maps have been developed,
several of them based on the framework of that map. These
later studies generally used an F1 population as the plant
material, with amplified fragment length polymorphisms
(AFLP), simple sequence repeats (SSR) and single
nucleotide polymorphisms (SNP) being the three major
marker types for map construction [6-16]. Although some
genetic maps for grape already exist, the total number of
markers on their linkage groups (LGs) is generally < 1,000,
and some of the mapped markers have no sequence
information. A high-density genetic map for grape that
contains a large number of molecular markers with
sufficient sequence information is therefore still lacking,
and is needed to meet the demand for improvement.

A key step in genetic map construction is the development
of a set of testable molecular markers. A number of
molecular marker technologies have been developed,
including RAPD, AFLP, SSR and SNP. RAPD and AFLP markers
have proven unstable because they are sensitive to
experimental conditions that are hard to control [17].
SSRs are considered to be among the most stable and
reliable markers for genetic map construction, but the
experiments are time-consuming and costly [18]. These
markers are therefore not suitable for high-throughput
construction of high-density genetic maps. SNPs are single
nucleotide polymorphisms or small InDels in the genome.
They can be more numerous than other types of markers,
but they are difficult to assay. Before the next
generation sequencing (NGS) technique was developed, a
number of other platforms were available for their
identification, such as SNP Gene-Chip [19],
high-resolution melt (HRM) analysis [20], TILLING and
EcoTILLING [21,22]. With improved sequencing technology,
the last two years have seen NGS combined with
restriction-site associated DNA (RAD) for SNP detection
[23]. Pfender et al. [24] successfully used RAD markers
to construct a high-density genetic map, which was
subsequently employed to detect QTL for resistance to
stem rust in Lolium perenne. Using 2,383 RAD prior
markers, an ultra-high-density genetic map was also
developed for barley by Chutimanitsakun et al. [25], who
showed that next generation RAD sequencing is a powerful
high-throughput technique. Next generation RAD sequencing
has also been successfully applied in other plants,
including globe artichoke [26] and eggplant [27].

In this study, an F1 population of grape was constructed
by crossing two interspecific hybrids, Z180 and Beihong.
Analyses of resveratrol content in the fruit skin, sugar
and acid contents in the berry, berry size and cold
resistance over several years revealed stable segregation
of these traits in this F1 population. To take advantage
of this population, a high-density genetic map was
constructed using next generation RAD sequencing for
genotyping. The > 1,500 SNP markers contained in this map
were analyzed and aligned with the reference grapevine
genome. This provided additional information on the
genomic structures of different Vitis species, and the map
can also be used to identify marker-linked loci that
potentially control the superior traits of the two parents.

Materials and methods 

Mapping population and DNA extraction 

The F1 mapping population consisted of 100 progeny from a
cross made in 2003 between Z180 (V. monticola × V. riparia)
and Beihong (V. vinifera × V. amurensis). Since pollen
abortion occurred in Z180, Beihong was employed as the
male parent. The seedlings of the two parents and their
progeny were planted in the vineyard of the Germplasm
Repository at the Institute of Botany of the Chinese
Academy of Sciences in Beijing.

Young leaf samples (second and third leaves from the
apex) were harvested from each individual F1 plant and
the two parents at the beginning of the vegetative period
(late spring). The samples were immediately frozen in
liquid nitrogen and transferred to a -70°C freezer. Young
leaves (0.5 g) from each plant were ground in liquid
nitrogen and their DNA was extracted using the DNeasy
Plant Mini Kit (Qiagen). DNA concentration was measured
and adjusted to the same level.

In-silico analysis of restriction enzyme-recognition sites 
on the reference grape genome 

The sequence of the Vitis vinifera Pinot noir PN40024
12x genome assembly was downloaded from the international
Grape Genome Browser
(http://www.genoscope.cns.fr/externe/GenomeBrowser).
The recognition sequences of 30 common restriction enzymes
(data not shown) were used to locate their digestion sites
in the reference genome with a Perl script. The total
number of digestion sites, the lengths of the resulting
fragments, and their distribution were calculated from the
results of the in-silico analysis.
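
The in-silico digestion can be illustrated with a minimal Python sketch. It is not the Perl script used in this study: the function names and the toy sequence are hypothetical, the MseI site (T/TAA) is taken from the Results, and a real analysis would loop over every chromosome of the reference assembly.

```python
import re


def digest_fragment_lengths(sequence, site="TTAA", cut_offset=1):
    """Return the lengths of the fragments produced by cutting at every site.

    cut_offset is the number of bases of the recognition site left on the
    upstream fragment (1 for MseI, which cuts T/TAA).
    """
    seq = sequence.upper()
    # A zero-width lookahead also finds overlapping recognition sites.
    pattern = re.compile("(?=" + re.escape(site) + ")")
    cuts = [m.start() + cut_offset for m in pattern.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def count_in_size_window(lengths, low=300, high=400):
    """Count fragments whose length falls in the size-selected window."""
    return sum(low <= n <= high for n in lengths)


# Toy sequence (hypothetical): two MseI sites flanking a 346-bp spacer.
toy_sequence = "GGG" + "TTAA" + "C" * 346 + "TTAA" + "GG"
toy_lengths = digest_fragment_lengths(toy_sequence)
print(toy_lengths, count_in_size_window(toy_lengths))  # [4, 350, 5] 1
```

Repeating the count for each candidate enzyme gives the number of fragments in the 300-400 bp window that is compared across enzymes.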

Sample preparation and data analysis 

Sample preparation for sequencing followed published
protocols for NGS combined with RAD [23-25,28], with a few
modifications. Illumina Solexa adapters (2006 Illumina,
Inc., all rights reserved), largely unmodified, were used
for library construction. In brief, 2 µg genomic DNA from
each sample (100 F1 progeny and both parents) was treated
with 20 units (U) MseI (New England Biolabs [NEB]) for
60 min at 37°C in a 50-µl reaction. A Quick Blunting Kit
(NEB) was used to convert 30 µl of the digested sample to
5'-phosphorylated, blunt-ended DNA in a 50-µl reaction
mixture; the reaction was performed with 30 µl of digested
sample, 5 µl 10X blunting buffer, 5 µl 1 mM dNTP mix, 2 µl
blunting enzyme mix and 8 µl sterile dH2O at room
temperature for 30 min. A 3'-adenine overhang was added to
the resulting samples in a 50-µl reaction with 32 µl
blunt-ended DNA sample, 5 µl Klenow buffer (10X), 10 µl
dATP (1 mM), 3 µl Klenow fragment (3'→5' exo-, 5 U/µl) and
sterile dH2O to the final volume at 37°C for 1 h. Then
2 µl of 100 nM P1 and P2 adapters, with a 3- to 5-bp
plant-specific index (barcode) at the 5' end and a thymine
overhang at the 3' end, was added to each sample in a
50-µl reaction. The sequences of the P1 and P2 adapters
were:

P1F: 5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCTxxxT-3' (xxx indicates the barcode)
P1R: 5' phos-yyyAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT-3' (yyy, reverse complement of xxx)
P2F: 5' phos-AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAG-3'
P2R: 5'-CTCGGCATTCCTGCTGAACCGCTCTTCCGATCTT-3'

A ligation reaction was carried out overnight at 16°C with
T4 DNA ligase, and 16 samples with different plant indices
were pooled into one. DNA fragments from 400 to 500 bp
(including the ~120-bp adaptor) were separated on a 1.5%
agarose gel and purified using a MinElute gel extraction
kit (Qiagen). Finally, all pooled samples were amplified
with Phusion High-Fidelity PCR Master Mix (NEB) for 18
cycles in a 100-µl reaction including 20 µl Phusion master
mix, 5 µl of 10 µM modified Solexa amplification primer
mix (AP1 and AP2; 2006 Illumina, Inc., all rights reserved)
and sterile dH2O to the final volume. The AP1 and AP2
primers contained Illumina sequencing primer sites. Their
sequences are:

AP1: 5'-AATGATACGGCGACCACCGAGATCT ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3'
AP2: 5'-CAAGCAGAAGACGGCATACGAGAT CGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCT-3'

The 3' portion of each primer following the space
(underlined in the original layout) is identical to the
Illumina sequencing primer site. PCR products were
repurified using the QIAquick PCR purification kit
(Qiagen) and sequenced on a Genome Analyzer II instrument.
All of these experiments were performed at Biomarker
Technologies Co. Ltd., Beijing
(http://www.biomarker.com.cn/english/).
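
The relationship between the two P1 oligonucleotides can be checked with a short Python sketch. The constant strings are the P1F and P1R sequences given above without their barcode positions; the barcode "ACG" and the function names are hypothetical and serve only as an illustration.

```python
# Fixed parts of the P1 adapter oligos as given in the text.
P1F_PREFIX = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
P1R_SUFFIX = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def p1_adapter_pair(barcode):
    """Build the P1F/P1R oligo pair for one plant-specific barcode.

    P1F carries the barcode followed by the 3' thymine overhang;
    P1R starts with the reverse complement of the barcode.
    """
    p1f = P1F_PREFIX + barcode + "T"
    p1r = reverse_complement(barcode) + P1R_SUFFIX
    return p1f, p1r


# Hypothetical 3-bp barcode, for illustration only.
p1f, p1r = p1_adapter_pair("ACG")
print(p1f)
print(p1r)
```

Apart from the single 3' thymine overhang, the two oligos are exact reverse complements, which is what allows them to anneal into a double-stranded adapter.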

SNP identification and F1 plant genotyping were performed
according to the method of Pfender et al. [24], with a few
modifications. A number of Perl scripts (Biomarker
Technologies Co. Ltd.) were written to conduct the
analysis. In brief, low-quality reads (those with five or
more bases with a Q score < 20) were discarded first, and
the Solexa sequences were assigned to the 102 plants
according to their index. The first 30 bp of each read
(designated RAD tags) were used for subsequent analysis.
For SNP marker identification, a cluster analysis was
performed on the data of both parents together. RAD tags
were compared, and nearly identical tags, with one or two
mismatches (SNPs or 1- to 2-bp InDels), were assigned to
one cluster. Clusters with > 200 or < 5 reads were
discarded. More than one mismatch on the 30-bp sequence of
the same RAD tag was considered a haplotype and regarded
as one potential SNP marker in the subsequent analyses.
Within one RAD tag cluster, mismatches among different
plants in the F1 population were considered putative
polymorphisms, and the different mismatches were regarded
as multiple alleles. The parental genotypes for each RAD
tag cluster were also analyzed according to the origin of
the 30-bp tag sequence. To genotype all 100 F1 plants,
their 30-bp sequences were also clustered and analyzed
separately, following the strategy applied in the cluster
analysis of the two parents' RAD tags. The genotypes for
each RAD tag cluster of a single F1 plant were then
determined by the identity between them and the
corresponding clusters in the two parents.
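
The clustering step can be sketched in Python as follows. This is a simplified illustration rather than the Perl pipeline used here: it is a greedy procedure that compares each tag with a cluster seed, it counts substitutions only (InDels are not handled), and all names are hypothetical. The thresholds (30-bp tags, at most two mismatches, 5 to 200 reads per cluster) are those stated above.

```python
from collections import Counter

TAG_LEN = 30  # first 30 bp of each read are used as the RAD tag


def hamming(a, b):
    """Number of mismatching positions between two equal-length tags."""
    return sum(x != y for x, y in zip(a, b))


def cluster_tags(reads, max_mismatch=2, min_reads=5, max_reads=200):
    """Greedily group nearly identical RAD tags and filter clusters by depth."""
    tag_counts = Counter(r[:TAG_LEN] for r in reads if len(r) >= TAG_LEN)
    clusters = []  # each cluster: {"seed": tag, "tags": {tag: read count}}
    # Most abundant tags are visited first and become cluster seeds.
    for tag, n in tag_counts.most_common():
        for cluster in clusters:
            if hamming(tag, cluster["seed"]) <= max_mismatch:
                cluster["tags"][tag] = n
                break
        else:
            clusters.append({"seed": tag, "tags": {tag: n}})
    # Drop repeat-like (> max_reads) and poorly covered (< min_reads) clusters.
    return [c for c in clusters
            if min_reads <= sum(c["tags"].values()) <= max_reads]


def is_polymorphic(cluster):
    """A cluster holding more than one distinct tag is a putative marker."""
    return len(cluster["tags"]) > 1


# Toy reads (hypothetical): one polymorphic locus, one low-depth locus,
# and one repeat-like locus.
reads = (["A" * 30 + "GGGG"] * 6 + ["A" * 29 + "C" + "GGGG"] * 3
         + ["G" * 30] * 2 + ["T" * 30] * 201)
kept = cluster_tags(reads)
print(len(kept), is_polymorphic(kept[0]))  # 1 True
```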

Linkage map construction 

Because of the lack of anchor markers in this study, we
first identified a set of SNP markers to assign the 19
LGs to the 19 grapevine chromosomes. This was performed in
two steps: 1) we recorded the segregation patterns of all
1,814 SNP markers as ab × cd, ef × eg, hk × hk, lm × ll
or nn × np. Three types of markers, ab × cd, ef × eg and
hk × hk, which could be mapped to both parental linkage
maps, were regarded as candidate anchor markers; 2) the
two representative 30-bp sequences of these candidate
anchor markers (because the alleles of a SNP marker have
nearly identical 30-bp sequences, the sequence of any
allele can represent the marker) were aligned with the
sequence of the 12x genomic assembly of V. vinifera Pinot
noir PN40024 using local BLAST software. The positions of
each sequence for one SNP marker on the genome were
identified by their highest number of hits. Three strict
criteria were used to screen the candidate anchor markers:
1) the marker had to show a normal segregation ratio among
the 100 F1 progeny; 2) both 30-bp end sequences had to
align with the same chromosome in the reference PN40024
genome; 3) the distance between the positions of the two
end sequences on the reference genome had to fall between
200 and 500 bp (the expected size of the digested
fragments was ~300-400 bp). The same alignment strategy
was also used to place the 1,646 mapped SNP markers on the
19 chromosomes for subsequent comparison.
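
The three screening criteria can be expressed as a small Python predicate. The `Hit` record, the pattern labels and the function name are hypothetical; in practice the hit positions would come from parsing the local BLAST output, and treating the 200-500 bp bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Segregation patterns that can be mapped in both parents.
ANCHOR_PATTERNS = {"abxcd", "efxeg", "hkxhk"}


@dataclass
class Hit:
    """Best alignment of one 30-bp end sequence on the reference genome."""
    chrom: str
    pos: int


def is_anchor_marker(pattern: str, distorted: bool,
                     end1: Optional[Hit], end2: Optional[Hit],
                     min_gap: int = 200, max_gap: int = 500) -> bool:
    """Apply the three screening criteria to one candidate anchor marker."""
    if pattern not in ANCHOR_PATTERNS:
        return False          # not mappable in both parents
    if distorted:
        return False          # criterion 1: normal segregation ratio
    if end1 is None or end2 is None:
        return False          # one end sequence has no alignment
    if end1.chrom != end2.chrom:
        return False          # criterion 2: same chromosome
    # Criterion 3: the two ends lie 200-500 bp apart (bounds assumed inclusive).
    return min_gap <= abs(end1.pos - end2.pos) <= max_gap


print(is_anchor_marker("efxeg", False, Hit("chr5", 10_000), Hit("chr5", 10_350)))
```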






The double pseudo-test cross strategy of Grattapaglia
and Sederoff [29] was applied, using JoinMap® 4.0
software, during the map construction. After the data had
been imported, a "CP" model was used for the analysis.
Marker segregation ratios were evaluated with a Chi-square
test, and markers showing significantly distorted
segregation (P-value < 0.001) were excluded from the map
construction. The genotypes of the 1,814 SNP markers were
analyzed for linkage and recombination, applying the
Kosambi function to estimate genetic map distances. To
group all 1,814 markers, logarithm of odds (LOD) score
thresholds > 7 were used. After the LGs had been computed,
their numbers were assigned according to the anchor
markers mapped on them. The integrated map for both male
and female plants was computed using the
'Combine Group for Map Integration' function.
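
Both calculations were done inside JoinMap; the following Python sketch only illustrates the underlying arithmetic under stated assumptions. The expected ratios are the usual Mendelian expectations for each segregation pattern in a full-sib family (1:1 for lm × ll and nn × np, 1:2:1 for hk × hk, 1:1:1:1 for ef × eg and ab × cd), the closed-form P-values cover only 1 to 3 degrees of freedom, and all names are hypothetical.

```python
import math

# Expected genotype-class ratios in the F1 progeny (assumed Mendelian).
EXPECTED_RATIOS = {
    "lmxll": (1, 1),
    "nnxnp": (1, 1),
    "hkxhk": (1, 2, 1),
    "efxeg": (1, 1, 1, 1),
    "abxcd": (1, 1, 1, 1),
}


def chi2_sf(x, df):
    """Upper-tail probability of the Chi-square distribution for df = 1, 2, 3."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("only 1 to 3 degrees of freedom are supported")


def segregation_p_value(pattern, observed):
    """Chi-square goodness-of-fit P-value for the observed genotype counts."""
    ratios = EXPECTED_RATIOS[pattern]
    if len(observed) != len(ratios) or sum(observed) == 0:
        raise ValueError("observed counts do not match the pattern")
    total = sum(observed)
    expected = [total * r / sum(ratios) for r in ratios]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_sf(chi2, len(ratios) - 1)


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


# A marker is excluded when P < 0.001 (threshold from the text).
print(segregation_p_value("lmxll", [80, 20]) < 0.001)  # True: distorted
print(round(kosambi_cm(0.1), 2))                       # 10.14
```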

Results and discussion 

Selection of suitable restriction enzymes for RAD 
sequencing library construction 

In this study, we did not sequence the whole genome of all
F1 plants; rather, we sequenced the two ends of the ~300-
to 400-bp RAD tags to reduce the complexity of the grape
genome and increase sequencing efficiency. Selection of a
suitable restriction enzyme for DNA digestion was
therefore key. In theory, an appropriate restriction
enzyme must meet two requirements: 1) because the NGS
technology can only read 75 to 100 bp at each end of a
fragment, the enzyme must digest the genome of interest
into fragments of a suitable size (e.g. ~300-400 bp);
2) the number of digested fragments of the expected size
should be sufficient for subsequent analysis
(100,000-150,000 RAD tags). The V. vinifera Pinot noir
PN40024 genome sequence was taken as the reference to
search for an appropriate restriction enzyme.

In-silico digestion with ~30 restriction enzymes showed
great differences in the number of recognition sites (data
not shown). One restriction enzyme, MseI, which recognizes
a 4-nucleotide site (T/TAA), was predicted to produce
149,921 DNA fragments of 300-400 bp from the grape genome,
suiting our requirements. The distribution of the
predicted digestion sites for this restriction enzyme is
shown in Figure 1. Based on these results, we selected
MseI as the restriction enzyme to construct the DNA
sequencing library.

SNP markers and their characteristics 

Once the DNA of the F1 individuals and their parents had
been digested with MseI, all samples were genotyped by
high-throughput sequencing. In total, ~16 Gb of raw
sequence data containing 117,084,991 paired-end (PE) reads
was obtained, with each read being ~70 bp in length. To
limit the effect of sequencing errors, only reads with
fewer than five bases with a Q score < 20 were analyzed
further. Of these high-quality data, ~149 Mb were from one
of the parents, Beihong, with 2,136,496 reads, and ~148 Mb
were from Z180 with 2,126,872 reads. To assign these reads
to their corresponding loci, a cluster strategy was used
for the two parents' data (described in Materials and
Methods). As the grape genome harbors a large number of
repeat sequences [30,31], these might affect the coverage
calculation and lead to misidentification of
polymorphisms. To overcome this obstacle, clusters
containing highly redundant reads were excluded (clusters
with > 200 reads), which removed the repeat sequences from
the data. Clusters with a low number of reads were also
excluded because of insufficient coverage of the loci
(clusters with < 5 reads). Finally, 37,871,193
high-quality reads without repeat sequences were retained
and assigned to 80,709 clusters for the whole F1
population (Table 1). Thus we obtained 80,709 valid loci
representing the whole grape genome. This number was lower
than the expected number of digested fragments
(100,000-150,000); however, it excluded the repeat
sequences and thus roughly corresponded to the in-silico
digestion result. Further calculation indicated that the
coverage of these loci was ~469-fold at the population
level (number of valid reads, 37,871,193, divided by
number of clusters, 80,709). To screen these 80,709
clusters for polymorphisms, a strict in-silico procedure
was carried out for SNP identification (described in
Materials and Methods). In total, 21,599 clusters showed
more than one genotype according to their sequence
diversity in the whole F1 population (Table 1). This
indicated an average polymorphism rate of 26.8% for the F1
population. A total of 11,144,665 reads were involved in
these polymorphic loci, and thus their average coverage
was ~516-fold at the population level. In addition, we
calculated the polymorphic loci for each F1 plant and its
parents. According to Figure 2, we obtained an average of
~12,840 reads involved in the polymorphic loci and thus a
17.0-fold coverage per cluster per individual. The number
of reads involved in the polymorphic loci ranged from
10,912 to 13,649 and the coverage ranged from 7.7- to
41.5-fold (Figure 2).

Figure 1 Distribution of the restriction enzyme MseI's predicted digestion sites. X axis indicates the size of the digested fragments; Y axis
indicates the number of fragments.
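
The population-level coverage and polymorphism rate follow directly from the read and cluster counts reported in Table 1. A short Python check of that arithmetic (the variable names are ours):

```python
# Counts reported in the text and in Table 1.
valid_reads = 37_871_193
clusters = 80_709
poly_clusters = 21_599
poly_reads = 11_144_665

# Coverage = reads per cluster at the population level.
coverage_all = valid_reads / clusters
coverage_poly = poly_reads / poly_clusters
coverage_nonpoly = (valid_reads - poly_reads) / (clusters - poly_clusters)

# Share of clusters showing more than one genotype.
polymorphism_rate = poly_clusters / clusters

print(f"all loci:        {coverage_all:.2f}-fold")      # 469.23
print(f"polymorphic:     {coverage_poly:.2f}-fold")     # 515.98
print(f"non-polymorphic: {coverage_nonpoly:.2f}-fold")  # 452.15
print(f"polymorphism rate: {polymorphism_rate:.1%}")    # 26.8%
```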

As already noted, the main advantages of NGS technology
are low cost and high throughput. However, it also has a
serious disadvantage in its high probability of sequencing
error [32]. To overcome this problem, high coverage of
each sequence must be obtained. We digested the DNA and
sequenced only the RAD tags, greatly reducing the portion
of the genome to be sequenced. Jaillon et al. (2007)
reported that the grapevine genome is ~470 Mb. During the
genotyping of our 102 plants, we handled only 80,709 valid
clusters, each containing an ~70-bp sequence. Thus the
grape genome was reduced to ~5.65 Mb (80,709 × 70 bp).
This amounts to an ~83-fold reduction compared with the
470-Mb reference genome, so that very little data is
required to achieve high coverage. According to our data,
the average coverage for each tag was 17.0-fold in an
individual plant. Moreover, because


Table 1 SNP modulation for the F1 population

                     Clusters    No. of reads    Coverage
Polymorphisms          21,599      11,144,665      515.98
Non-polymorphisms      59,110      26,726,528      452.15
Total                  80,709      37,871,193      469.23

all sequence tags were from the two parents, Beihong and
Z180, the number of alleles for each locus was ≤ 4. The
total coverage for each tag at the population level was
~469-fold, allowing us to adjust the SNP calls at loci
where coverage in an individual plant was insufficient. In
addition, with these and the subsequent strict criteria,
the coverage of the clusters corresponding to the final
SNP markers on the genetic map was almost always greater
than 7-fold in an individual plant; only 24 showed 5- to
7-fold coverage. Based on the above analyses, we concluded
that the applied strategy provides high-throughput and
high-quality identification of SNPs.
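
The genome reduction quoted above is a simple product and ratio. A short Python check, using only the figures stated in the text (80,709 clusters, ~70 bp per read, ~470 Mb genome; the variable names are ours):

```python
valid_clusters = 80_709
read_length_bp = 70          # approximate read length
genome_size_bp = 470_000_000  # ~470 Mb reference genome

# Portion of the genome actually represented by the RAD tag clusters.
reduced_bp = valid_clusters * read_length_bp
fold_reduction = genome_size_bp / reduced_bp

print(f"reduced genome: {reduced_bp / 1e6:.2f} Mb")  # 5.65 Mb
print(f"reduction: ~{fold_reduction:.0f}-fold")      # ~83-fold
```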

There were a number of possible segregation patterns for
the polymorphic markers in an F1 population (ab × cd,
ef × eg, hk × hk, lm × ll, nn × np and aa × bb). However,
the last pattern, aa × bb, could not be used for genetic
map construction because it does not segregate in an F1
population, even though it probably constituted the
largest proportion of all marker types. The segregation
pattern of every locus therefore had to be determined
before a linkage map could be constructed. In addition,
despite the high average coverage of the predicted RAD tag
clusters, a number of clusters still had low coverage in
some F1 plants. To increase the accuracy of our data, only
clusters with at least 3-fold coverage in > 80% of the F1
plants were used for subsequent development of SNP
markers. We screened all 21,599 polymorphic clusters based
on the above criteria and obtained 1,814 valid SNP markers
with segregation patterns of ab × cd, ef × eg, hk × hk,
lm × ll or nn × np (note that if two polymorphic clusters
came from the same MseI-digested fragment, they were
regarded as one marker). In addition to sequence coverage,
the integrity (completeness of genotype data) of each
locus among the 100 F1 individuals and their two parents
was a key parameter in controlling map quality. We
therefore investigated the missing-data rate for these
plants, and found full integrity for the two parents, Z180
and Beihong, and 92.3% integrity on average for the 100 F1
plants. For a single SNP marker, the lowest integrity was
~85%, meeting the requirement for LG construction. Of
these 1,814 SNP markers, 1,545 were heterozygous in one
parent and homozygous in the other (960 for lm × ll and
585 for nn × np), constituting 85.2% of all selected SNP
markers. The other three types of markers, which could be
mapped on both female and male linkage maps, amounted to
only 14.8% (ab × cd: 77, ef × eg: 171 and hk × hk: 21).
Thus, at most 269 SNP markers could be used as shared
markers for the integration of the two parents' maps into
one.
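
How a locus falls into one of these segregation patterns can be illustrated with a small Python function that takes the two alleles of each parent. This is a sketch and not the pipeline used in this study: the function name and the allele labels are hypothetical, and the female parent is assumed to be written first, as in lm × ll.

```python
def classify_cross(female, male):
    """Return the segregation pattern for one locus.

    female and male are 2-tuples of allele labels, e.g. the distinct
    30-bp tag sequences observed in each parent.
    """
    f, m = set(female), set(male)
    f_het, m_het = len(f) == 2, len(m) == 2
    if f_het and m_het:
        shared = len(f & m)
        if shared == 0:
            return "abxcd"   # four alleles
        if shared == 1:
            return "efxeg"   # three alleles, one shared
        return "hkxhk"       # both parents carry the same two alleles
    if f_het:
        return "lmxll"       # segregates in the female parent only
    if m_het:
        return "nnxnp"       # segregates in the male parent only
    # Both parents homozygous: no segregation in the F1.
    return "aaxbb" if f != m else "non-polymorphic"


print(classify_cross(("A", "B"), ("A", "C")))  # efxeg
print(classify_cross(("A", "A"), ("B", "B")))  # aaxbb
```

Only the first three patterns segregate in both parents, which is why they are the only candidates for shared markers.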

Because all of the SNP markers in this study were newly
developed and no LG information was available for them, we
identified a set of anchor markers that would indicate
their chromosomal location. As described in Materials and
Methods, the chromosome locations of the 269 markers with
ab × cd, ef × eg and hk × hk segregation patterns were
determined from their sequence alignment to the grape
reference genome. After a series of strict selections and
calculations, 212 markers showed a clear chromosome
location (Additional file 1: Table S1). Of these anchor
markers, two were located on random chromosomes because
the grape genomic sequence has not been completely
assembled. The lowest number of anchor markers was on
chromosome 15, with only two being usable for map
construction (Additional file 1: Table S1). The average
number of anchor markers per chromosome was ~11.2 and only
one chromosome had < 5 markers. This indicated that these
anchor markers were sufficient for LG assignment.

Figure 2 Valid read number and coverage for each plant in the F1 population and their parents. The X axis in a and b indicates the
accession, including the two parents and the average; the Y axis in a indicates read number, and in b, cluster (locus) coverage.

Genetic maps 

When the data preparation was complete, the 1,814 SNP
markers were imported into JoinMap 4.0 for map
construction. In total, 1,121 markers fell into 19 LGs for
Z180 (female), 759 markers for Beihong (male), and 1,646
markers for the integrated map, with grouping LOD values
of 7 to 13 (Figures 3, 4, 5, and 6, Additional file 2:
Figure S1 and Additional file 3: Table S2). The difference
in the number of markers between Z180 and Beihong might
indicate that the heterozygosity of Z180 is higher than
that of Beihong; this is consistent with the results of
ongoing research in our group on the diversity of
different Vitis germplasm (unpublished). Among these 19
LGs, Z180 LG08 and Beihong LG14 did not form a single
continuous group, but were each divided into two short
LGs. Of the 212 anchor markers, 19 did not map to either
Z180 or Beihong LGs, and 5 markers were specific to
Beihong LGs. Thus 188 markers could be mapped on both Z180
and Beihong maps (Table S1). Further analysis of the
location of the anchor markers revealed that their
assignment to each chromosome by alignment to the
reference genome and by LG clustering was identical. This
suggested conservation of the genome structure among
different species and supported the accuracy of our
genotyping data.

Taking into account the size of all LGs, the map length
amounted to 1,884.3 cM for Z180 (female), 1,740.5 cM for
Beihong (male), and 1,917.3 cM for the integrated map
(Table 2). The average intervals between two adjacent
mapped markers were 1.68 cM, 2.29 cM and 1.16 cM for the
Z180, Beihong and integrated maps, respectively. The total
physical size of the grape genome is ~470 Mb [30,31],
meaning that 1,000 kb of DNA sequence corresponded to an
average genetic distance of ~4.0 cM in this study.
Although the subsequent analysis found no significant
correlation between genetic and physical size, these data
still indicate that the average physical intervals between
two adjacent mapped markers were ~420 kb (1.68/4.0 × 1000)
for Z180, ~573 kb for Beihong, and ~290 kb for the
integrated map. In previously reported Vitis genetic maps,
the total number of markers on the LGs is generally
< 1,000 [6-16]; the density of the linkage maps developed
for the F1 population of Z180 × Beihong was therefore very
high. In addition, the total sizes of grape genetic maps
ranged from ~1100 to ~1700 cM in previous studies [6-16]
and were smaller than our map. This difference might be
attributed to the larger number of markers and to the
interspecific F1 population used in this study: more
markers on a genetic map can detect more recombination
events, and an interspecific cross could produce more
recombination. Further analysis revealed that the markers
on these 19 LGs were not evenly distributed. The maximum
number of markers occurred on LG18, with 95 markers for
the female, 74 for the male and 148 for the integrated
map. The minimum number of markers occurred on LG15: 15
for Z180, 22 for Beihong and 34 for the integrated map.
The size of the LGs also varied widely (Table 2): the
longest LGs were LG05 for Z180 (133.2 cM), LG07 for
Beihong (122.8 cM) and LG13 for the integrated map
(118.5 cM); the shortest were LG15, LG11 and LG15 for the
Z180, Beihong and integrated maps, with 57.4 cM, 76.3 cM
and 65.1 cM, respectively (Table 2). In terms of the
physical size of the corresponding chromosomes [31], the
longest and shortest were LG18 and LG17, with 34.4 and
17.9 Mb, respectively. The different physical and genetic
rankings of the LGs led us to investigate the correlation
between the two. Both the female and male maps showed a
very weak correlation (r = 0.25) between genetic and
physical size among these 19 LGs/chromosomes, which might
indicate that recombination rates differ among chromosomes
during meiosis.

Figure 3 Integrated linkage groups 1 to 5 for Z180 × Beihong.
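
The average marker intervals and their physical equivalents can be recomputed from the totals in Table 2. The Python sketch below uses hypothetical function names and follows the text in applying a single rounded conversion factor of ~4.0 cM per Mb (1,884.3 cM / 470 Mb) to the rounded intervals, so the kb values are approximate.

```python
# Totals from Table 2: (number of mapped markers, map length in cM).
MAPS = {
    "Z180": (1121, 1884.3),
    "Beihong": (759, 1740.5),
    "integrated": (1646, 1917.3),
}
GENOME_MB = 470.0  # approximate physical size of the grape genome


def mean_spacing_cm(n_markers, map_cm):
    """Average genetic interval, computed as map length / marker number."""
    return map_cm / n_markers


def spacing_kb(spacing_cm, cm_per_mb=4.0):
    """Convert a genetic interval to kb with a genome-wide average rate."""
    return spacing_cm / cm_per_mb * 1000


print(f"cM per Mb (Z180 map): {MAPS['Z180'][1] / GENOME_MB:.1f}")
for name, (n_markers, map_cm) in MAPS.items():
    cm = round(mean_spacing_cm(n_markers, map_cm), 2)
    print(f"{name}: {cm:.2f} cM, about {spacing_kb(cm):.1f} kb")
```

Because recombination rates vary along and among chromosomes, the kb values are genome-wide averages only and say little about any particular interval.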

A number of future studies can be based on the
high-density genetic map developed in this work. First,
several excellent traits exist in one of the two parents.
Thus, a given trait might be improved by selecting markers
that are linked to elite loci or alleles after QTL
detection. Moreover, several excellent traits might be


Table 2 Genetic map for 19 linkage groups (LGs)

         Number of markers                  Genetic size (cM)                  Chromosome
LG       Female   Male       Integrated     Female    Male       Integrated   size (Mb) a
         (Z180)   (Beihong)  map            (Z180)    (Beihong)  map
LG01     63       30         85             118.0     80.1       112.2        23.6
LG02     46       31         68             101.1     79.5       108.9        18.7
LG03     55       40         78             80.0      79.0       89.0         20.5
LG04     71       35         94             94.1      83.7       93.7         23.9
LG05     71       32         93             133.2     100.5      108.8        25.4
LG06     62       38         88             90.4      93.9       107.6        21.5
LG07     87       58         133            113.7     122.8      116.5        22.4
LG08     79       53         117            103.3     92.6       109.5        22.4
LG09     51       30         70             85.6      100.1      94.4         23
LG10     66       42         98             108.9     112.4      114.6        18.8
LG11     28       23         44             92.2      76.3       79.2         20.0
LG12     64       49         94             103.3     97.5       108.1        24.2
LG13     83       53         115            120.4     91.1       118.5        27.4
LG14     53       39         80             107.5     81.5       92.1         30.3
LG15     15       22         34             57.4      85.5       65.1         20.3
LG16     28       27         47             83.1      87.2       87.3         22.8
LG17     50       53         87             91.8      76.9       88.8         17.9
LG18     95       74         148            104.8     111.8      108.5        34.4
LG19     54       30         73             95.5      88.3       114.5        24.02
Total    1121     759        1646           1884.3    1740.5     1917.3       441.6

a The physical sizes are according to Jaillon et al. (2007).
combined in one grape plant, thereby producing a new 
cultivar, through a series of crosses and marker-assisted 
selection (MAS). Second, compared to other genetic maps 
for grape, our map has two obvious advantages: high density 
and complete sequence information for all markers 
(Additional file 3: Table S2). These advantages could 
greatly benefit comparative mapping and genome assembly. 
The combined 60-bp sequences of the markers mapped to 
the LGs could be used as anchors for the genome. 
Although the genome sequence of grapevine was published 
several years ago, it still has a number of gaps and random 
sequences [30,31]. In this study, a set of markers could be 
aligned to the random chromosomes of V. vinifera Pinot 
noir PN40024 (data not shown). Based on the positions of 
these markers on the LGs, it might be straightforward to assign 
the random sequences to their corresponding chromosomes. On the other hand, the 
published grape genome is only for V. vinifera, and the 
genome structures of different Vitis species are expected 
to be more or less different due to the long evolutionary 
history of the Vitaceae [33]. Thus, comparing the genome 
characteristics of the different species could give us a bet- 
ter understanding of grape. The 1,646 mapped markers' 
combined 60-bp sequences could be used as shared 
anchors to compare genetic and physical maps (Additional 
file 3: Table S2). These studies might facilitate use of the 
grape genomic resource. 
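
The idea of using mapped markers as anchors for the random sequences can be sketched as a simple voting procedure. The example below is illustrative only and is not the procedure used in this study: the function name `assign_scaffolds`, the input format (pairs of scaffold identifier and linkage group, one pair per marker) and both thresholds are assumptions introduced here.

```python
from collections import Counter, defaultdict


def assign_scaffolds(marker_hits, min_markers=2, min_fraction=0.8):
    """Assign unplaced scaffolds to linkage groups by majority vote.

    marker_hits: iterable of (scaffold_id, linkage_group) pairs, one per
                 marker that aligns to an unplaced scaffold and is also
                 located on a linkage group (hypothetical input format).
    min_markers, min_fraction: hypothetical thresholds; a scaffold is
                 assigned only if at least `min_markers` markers agree and
                 they make up at least `min_fraction` of its markers.
    """
    votes = defaultdict(Counter)
    for scaffold, linkage_group in marker_hits:
        votes[scaffold][linkage_group] += 1

    assignments = {}
    for scaffold, counts in votes.items():
        best_lg, best_n = counts.most_common(1)[0]
        total = sum(counts.values())
        if best_n >= min_markers and best_n / total >= min_fraction:
            assignments[scaffold] = best_lg
    return assignments


# Toy input: scaffold names are invented for illustration.
example_hits = [
    ("random_A", "LG05"), ("random_A", "LG05"), ("random_A", "LG05"),
    ("random_B", "LG01"), ("random_B", "LG02"),
    ("random_C", "LG07"),
]
example_assignments = assign_scaffolds(example_hits)
```

Requiring several concordant markers per scaffold is a conservative choice: a scaffold supported by a single marker, or by markers from different LGs, is left unassigned rather than placed on weak evidence.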

Comparison of genetic and physical maps 

To compare the genetic and physical maps, we investi- 
gated the locations of all 1,814 SNP markers on the 
reference genome. The high-quality 30-bp sequences 
from both ends of each SNP marker were employed for 
the location search by aligning them to the reference 
genome. A total of 1,456 SNP markers showed a match 
between their two ends and the same positions (intervals 
of 200-500 bp) on the reference genome; 106 markers 
only showed a match for one end to one position on the 
reference genome, while the other end had no match; 
the remaining 252 markers showed no match to the 
reference genome, showed a conflict in matching posi- 
tions for the two ends, or were mapped on the random 
genome. To increase accuracy, only the first type of mar- 
kers (1,456 SNP markers) was used to compare the gen- 
etic and physical maps. 
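
The three marker classes described above can be expressed as a small decision rule. This is a minimal sketch under stated assumptions: each 30-bp end is represented by its single best alignment as a `(chromosome, position)` tuple or `None`, unplaced sequences are recognised by a name ending in `_random` or equal to `chrUn`, and a marker with only one aligned end that falls on an unplaced sequence is counted in the third class. The function and label names are hypothetical.

```python
from collections import Counter
from typing import Optional, Tuple

# Best alignment of one 30-bp marker end: (chromosome, position) or None.
Hit = Optional[Tuple[str, int]]

BOTH_ENDS = "both_ends_concordant"   # first class (1,456 markers in the text)
ONE_END = "one_end_only"             # second class (106 markers)
OTHER = "unmatched_or_conflicting"   # third class (252 markers)


def is_unplaced(chrom: str) -> bool:
    """Assumed naming convention for 'random' (unanchored) sequences."""
    return chrom.endswith("_random") or chrom == "chrUn"


def classify_marker(left: Hit, right: Hit,
                    min_gap: int = 200, max_gap: int = 500) -> str:
    """Classify a marker from the alignments of its two end sequences."""
    if left is None and right is None:
        return OTHER
    if left is None or right is None:
        hit = left if left is not None else right
        return OTHER if is_unplaced(hit[0]) else ONE_END
    (chrom_l, pos_l), (chrom_r, pos_r) = left, right
    if is_unplaced(chrom_l) or is_unplaced(chrom_r):
        return OTHER
    if chrom_l == chrom_r and min_gap <= abs(pos_l - pos_r) <= max_gap:
        return BOTH_ENDS
    return OTHER


def tally(markers):
    """Count markers per class; `markers` is an iterable of (left, right)."""
    return Counter(classify_marker(left, right) for left, right in markers)
```

Only markers in the first class would then be carried forward to the comparison of genetic and physical maps, as in the text.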

From Table 3 and Additional file 3: Table S2, 892 
common markers were found between the physical and 
Z180 (female) genetic map; 606 common markers were 
found between the physical and Beihong (male) genetic 
map. This indicated that 79.6% (892/1,121) of the mar- 
kers on the female LGs could be mapped on the refer- 
ence genome; similarly, 79.8% of the markers on the 
male LGs could be mapped on the reference genome. 
Among the 19 chromosomes or LGs, LG18 showed the 
highest number of common markers between the 
physical and genetic maps for Z180 and Beihong (75 and 
61, respectively); LG15 showed the lowest number of 
common markers, only 13 for the Z180 map and 17 for 
the Beihong map. To compare the order of the common 
markers, a dot-plot diagram (Figure 7) was generated 
using the physical position of each common marker on 
the reference genome against its genetic position on the 
LGs; at the same time, all LGs of the two parental maps 
were aligned with the reference genome (Additional file 
4: Figure S2). According to these two analyses, most of 
the markers showed good linear agreement between 
physical and genetic maps on the basic framework. 
However, there were also chromosomes showing 
rearrangement of some regions. Among the 19 LGs, 
Chr01, 03, 04, 05, 06, 08 (two LGs for male), 09, 10, 12, 
13, 14, 17, 18 and 19 showed high collinearity for both 
female and male maps. The remaining LGs showed 
high collinearity for only one map. Because both parents 
were produced by interspecies crosses (V. monticola × V. 
riparia and V. vinifera × V. amurensis), some of the 
regions in the two parent genetic maps might be identi- 
cal to the reference genome (V. vinifera); nevertheless, 
most of the regions are expected to come from the other 
three Vitis species. Therefore, the same order for the two 
types of map most probably indicates conservation of 



Table 3 Number of common markers between genetic 
and physical maps for 19 individual chromosomes 

Linkage group   Number of markers
                Z180      Beihong
LG01            55        24
LG02            36        18
LG03            50        36
LG04            50        31
LG05            64        30
LG06            53        32
LG07            54        40
LG08            68        41
LG09            43        27
LG10            35        27
LG11            25        20
LG12            50        37
LG13            62        45
LG14            45        31
LG15            13        17
LG16            21        22
LG17            47        42
LG18            75        61
LG19            46        25
Total           892       606

Figure 4 Integrated linkage groups 6 to 9 for Z180 × Beihong. 






Figure 5 Integrated linkage groups 10 to 14 for Z180 × Beihong. 

Figure 6 Integrated linkage groups 15 to 19 for Z180 × Beihong. 

Figure 7 Collinear analysis of the consensus between genetic and physical maps. The X axis indicates the physical position of each SNP 
marker; the Y axis indicates the genetic position of each SNP marker. Red diamonds indicate the female genetic position against the reference 
physical position; black diamonds indicate the male genetic position. 



genomes among the different grape species; the non- 
collinearity for some chromosome regions might indicate 
some variations among different grape species during 
evolution. 
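
The collinearity assessed visually in the dot-plot could also be summarised numerically. The sketch below uses a Spearman rank correlation between physical and genetic positions within one linkage group; this is a substitute technique introduced here for illustration, not the analysis performed in this study, and the function names are hypothetical. Average ranks are used so that co-segregating markers with identical genetic positions are handled as ties.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(physical_bp, genetic_cm):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(physical_bp) != len(genetic_cm) or len(physical_bp) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx = average_ranks(physical_bp)
    ry = average_ranks(genetic_cm)
    n = len(rx)
    mx = sum(rx) / n
    my = sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Because a linkage group may be oriented opposite to the reference chromosome, the absolute value of the coefficient is the relevant quantity: values near 1 correspond to a well-conserved marker order, and lower values to rearranged regions.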

Moreover, using the high-quality, high-density genetic 
map, we could investigate not only the variation in genome 
structure among different species but also the variation 
that occurred during interspecies crosses. There have been 
a number of studies on interspecies hybridization in the 
past decade. Chromosome rearrangement, retrotransposon 
activation and SSR mutations have been seen in interspe- 
cies crosses between different types of Brassica species 
[34]. For our plant material, the parents came from two 
separate interspecies crosses (V. monticola × V. riparia and 
V. vinifera × V. amurensis), and thus the population con- 
tained four grape pedigrees. An overview comparative ana- 
lysis of the genetic map and the reference genome 
(Figure 7) reveals a number of markers in some regions 
that were not in the same order. Moreover, we achieved a 
similar result by comparing the Z180, Beihong and 
integrated genetic maps (Additional file 2: Figure S1). The 
variations among the different species might be the primary 
reason for this non-uniformity. However, genomic variation 
arising from Vitis interspecies crosses might also exist: 
we observed a number of markers whose positions differed, 
as uniform blocks, between the male genetic 
map and the V. vinifera physical map, even though half of the pedigree of the male parent 
(Beihong) is V. vinifera. Thus, 
with more detailed and complete investigation of the genetic 
map, more knowledge of the variation among different 
species and interspecies crosses can be obtained in the 
future. 
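
Markers that depart from the shared order, such as the displaced blocks mentioned above, could be listed automatically. The following sketch is one possible approach and is not taken from this study: markers are sorted by physical position, a longest non-decreasing subsequence of their genetic positions is taken as the collinear backbone, and the remaining markers are reported as displaced. The function names and the tuple layout `(name, physical_bp, genetic_cm)` are assumptions, and the linkage group is assumed to have the same orientation as the reference chromosome.

```python
from bisect import bisect_right


def collinear_backbone(genetic_cm):
    """Indices of one longest non-decreasing subsequence of genetic_cm."""
    tails = []       # tails[k]: smallest final value of a run of length k + 1
    tails_idx = []   # index in genetic_cm of that final value
    prev = [-1] * len(genetic_cm)
    for i, g in enumerate(genetic_cm):
        k = bisect_right(tails, g)  # bisect_right allows equal cM values
        if k == len(tails):
            tails.append(g)
            tails_idx.append(i)
        else:
            tails[k] = g
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1
    # Walk back from the end of the longest run to recover its members.
    backbone = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        backbone.append(i)
        i = prev[i]
    return backbone[::-1]


def displaced_markers(markers):
    """Names of markers outside the collinear backbone.

    markers: list of (name, physical_bp, genetic_cm) tuples for one LG.
    """
    ordered = sorted(markers, key=lambda m: m[1])
    keep = set(collinear_backbone([m[2] for m in ordered]))
    return [m[0] for i, m in enumerate(ordered) if i not in keep]


# Toy example with invented marker names and positions.
toy = [("a", 100, 0.0), ("b", 200, 1.5), ("c", 300, 40.0),
       ("d", 400, 3.0), ("e", 500, 4.2)]
toy_displaced = displaced_markers(toy)
```

Runs of physically adjacent markers in the returned list would correspond to candidate rearranged blocks, which could then be checked against the parental and integrated maps.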

Conclusions 

We constructed a high-density, high-quality genetic map of a 
Z180 × Beihong F1 population. According to the 
analysis of the SNPs and their sequence information, we 
conclude that next generation RAD sequencing is a powerful 
strategy for genotyping. With further characterization 
of the genetic map, variations and conservation between 
the genetic map and reference genome were clearly 
detected. This genetic map is expected to be useful for 
QTL detection, sequence assembly and genome structure 
comparisons. 